Improved polar-code-based efficient post-processing algorithm for quantum key distribution

Combined with the one-time pad encryption scheme, quantum key distribution guarantees the unconditional security of communication in theory. However, error correction and privacy amplification in the post-processing phase of quantum key distribution introduce a high time delay, which limits the final secret key generation rate and the practicability of quantum key distribution systems. To alleviate this limitation, this paper proposes an efficient post-processing algorithm based on polar codes for quantum key distribution. In this algorithm, by analyzing the channel capacities of the main channel and the wiretap channel under Wyner's wiretap channel model, we design a codeword structure of polar codes so that error correction and privacy amplification can be completed synchronously in a single step. By combining error correction and privacy amplification into a single step, this post-processing algorithm reduces the complexity of the system and lowers the post-processing delay. In addition, the conditions for reliable and secure communication under this algorithm are given. Simulation results show that this post-processing algorithm satisfies the reliable and secure communication conditions well.

www.nature.com/scientificreports/
applied polar codes, whose code rate has been proved to achieve the Shannon limit, to QKD, and discussed the feasibility. Later research 28 shows that, at short code lengths, the efficiency of polar codes is higher than that of LDPC codes. In the past several years, the application of polar codes in QKD systems has drawn the attention of researchers 29-36.
In the aspect of privacy amplification, a universal class of hash functions 37 has been widely used for information compression to guarantee the security of the secret key. However, due to its high computational complexity, this scheme has a high time delay. To lower the time delay, researchers apply Toeplitz hashing, which has become the most widely used privacy amplification method in recent years 25,38-40. By combining Toeplitz hashing with the fast Fourier transform, researchers have reduced the computational complexity of Toeplitz hashing to O(n log n).
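As a rough illustration of the FFT-based approach mentioned above, the following Python sketch multiplies a binary Toeplitz matrix by a bit string through a linear convolution computed with NumPy's FFT, and checks the result against the direct matrix product. The function names, the parameterization of the matrix by its diagonals, and the sizes in the usage lines are assumptions made for this example; they are not taken from the cited works.

```python
import numpy as np

def toeplitz_hash_direct(diag_bits, key_bits, out_len):
    """Reference O(m*n) product T @ x over GF(2), with T[i, j] = diag_bits[i - j + n - 1]."""
    x = np.asarray(key_bits, dtype=np.int64)
    d = np.asarray(diag_bits, dtype=np.int64)
    n = x.size
    rows = np.arange(out_len)[:, None]
    cols = np.arange(n)[None, :]
    T = d[rows - cols + n - 1]          # entries are constant along each diagonal
    return (T @ x) % 2

def toeplitz_hash_fft(diag_bits, key_bits, out_len):
    """Same product in O(n log n): T @ x is a slice of the convolution of diag_bits and key_bits."""
    x = np.asarray(key_bits, dtype=np.float64)
    d = np.asarray(diag_bits, dtype=np.float64)
    n = x.size
    if d.size != n + out_len - 1:
        raise ValueError("diag_bits must have length n + out_len - 1")
    full_len = d.size + n - 1                    # length of the linear convolution
    size = 1 << (full_len - 1).bit_length()      # next power of two, avoids wrap-around
    conv = np.fft.irfft(np.fft.rfft(d, size) * np.fft.rfft(x, size), size)
    # Output bit i is convolution entry i + n - 1, reduced modulo 2.
    return np.rint(conv[n - 1:n - 1 + out_len]).astype(np.int64) % 2

# Hypothetical sizes: compress a 1024-bit string to 256 bits.
rng = np.random.default_rng(0)
n, m = 1024, 256
diag = rng.integers(0, 2, n + m - 1)   # random seed bits defining the Toeplitz matrix
key = rng.integers(0, 2, n)
digest = toeplitz_hash_fft(diag, key, m)
print(np.array_equal(digest, toeplitz_hash_direct(diag, key, m)))
```

For very long inputs, floating-point rounding in the FFT would need attention; the sketch does not address this.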
To provide a new way to reduce the complexity and lower the time delay of the post-processing phase in QKD systems, a polar-code-based efficient QKD post-processing algorithm is proposed in this paper. Using Wyner's wiretap channel model, we design a codeword structure of polar codes which satisfies the reliability and security requirements of QKD post-processing. This idea has been applied to different communication systems in recent years 41,42. In this way, error correction and privacy amplification, which are the most time-consuming steps in QKD post-processing, can be completed synchronously in a single encoding and decoding process. Therefore, the complexity and time delay of post-processing can be reduced, and the final key generation rate can be improved. This helps to break through the bottleneck in realizing high-speed QKD systems and promotes the practicability of QKD.
In 2019, we proposed polar-code-based one-step post-processing for quantum key distribution in our previous work 43. However, that work has three main drawbacks. First, the security condition (see Eq. (5) in 43) is inaccurate and ambiguous. We therefore modify the security condition in this paper (see Eq. (5) in this paper). Second, the protocol in 43 is incomplete, which may result in decoding failure and insecurity (see steps 1 to 10 and Fig. 3 in 43). In this paper, we modify the protocol (see steps 1 to 10 and Fig. 4 in this paper), which makes it more reliable and secure. Last but most important, our previous work lacks experimental verification, since we only calculated the coding rate and analyzed the reliability and security in theory. In this paper, we verify the reliability and security of the protocol through a large number of simulation experiments (see the section "Simulation results").
The rest of this paper is organized as follows. In the second section, we introduce the basic theory of Wyner's wiretap channel model, the secrecy capacity of discrete variable QKD (DVQKD) systems and polar codes. In the third section, the polar-code-based efficient QKD post-processing algorithm is introduced, after which we discuss its reliability and security. The fourth section gives the simulation results on code rate, decoding reliability and communication security. In the last section, we summarize our work.

Basic theory
Wyner's wiretap channel model. The goal of secret communication is to realize reliable and secure information transmission between two legitimate parties even under eavesdropping. A channel under eavesdropping can be described by Wyner's wiretap channel model 44, which is shown in Fig. 1. The legitimate sender Alice encodes the original information U of length k into a code X of length n and sends X to the legitimate receiver Bob through the main channel W, after which Bob gets information Y. In the meantime, the eavesdropper Eve eavesdrops through the wiretap channel W* and gets information Z. After decoding, Bob gets the estimate U′ of the original information U and Eve gets the estimate U′′.
In Wyner's wiretap channel model, when the wiretap channel W* is degenerative with respect to the main channel W (that is to say, the channel capacity of the wiretap channel C(W*) is smaller than the channel capacity of the main channel C(W)), one can design, as the code length tends to infinity, a secure coding scheme which satisfies both communication reliability and security. Furthermore, the largest code rate is equal to the secrecy capacity C_sec, which is defined by C_sec ≡ C(W) − C(W*). In other words, for all ε > 0, there exist coding schemes of rate R ≥ C_sec − ε that asymptotically achieve both the reliability and the security objectives 45. Here, the reliability is measured by the decoding bit error rate (BER) of Bob, and the security is measured by the mutual information of U′′ and U. Reliable communication means that Bob's decoding BER tends to zero as the code length n tends to infinity, and secure communication means that the mutual information of U′′ and U tends to zero in the same limit.
Channel capacity of DVQKD post-processing systems under Wyner's wiretap channel model. In QKD systems, after qubit transmission and sifting, Alice obtains the sifted key KA_sifted and Bob obtains the sifted key KB_sifted. Due to device imperfections, channel noise and possible eavesdropping in a practical QKD system, in general KA_sifted ≠ KB_sifted; namely, there are error bits. Denote the bit error rate in a practical QKD system by p.
DVQKD is the most mature and most widely used type of QKD system. For DVQKD systems which apply the BB84 protocol, the qubit transmission channel can be regarded as a binary symmetric channel (BSC). Under this assumption, the mutual information between Alice and Bob is
$$I_{AB} = 1 - h_2(p),$$
where h_2(·) is the binary entropy function 22. For maximum security, we attribute all the noise in a practical system to eavesdropping. Hence, the information Eve can obtain is at most
$$I_{AE} = h_2(p).$$
If we adopt Wyner's wiretap channel model to describe the QKD system, the channel capacity of the main channel W is
$$C(W) = 1 - h_2(p),$$
the channel capacity of the wiretap channel is
$$C(W^*) = h_2(p),$$
and the secrecy capacity is
$$C_{sec} = C(W) - C(W^*) = 1 - 2h_2(p).$$
The secrecy capacity is equal to the secure final key generation rate k_th 2. Practical DVQKD systems require that k_th = 1 − 2h_2(p) ≥ 0. This means that the value range of the QBER p is [0, 0.11] and C(W) ≥ C(W*). Hence, according to Wyner's wiretap channel model theory, within this range of p, the channel W* between Alice and Eve is degenerative with respect to the channel W between Alice and Bob, and we can design a coding scheme which achieves the secrecy capacity. The rest of this paper is based on this prerequisite.
Polar codes. Polar codes were the first coding scheme proved in theory to achieve the Shannon limit 46. Besides, the encoding and decoding complexity of polar codes is relatively small compared with LDPC codes 46. By recursively polarizing N independent identically distributed (i.i.d.) channels whose capacities are all C, one obtains N coordinate subchannels whose capacities polarize: with the growth of the code length N, the capacity of about N · C coordinate subchannels asymptotically tends to 1, while the capacity of the other N · (1 − C) coordinate subchannels asymptotically tends to 0. That is to say, the former N · C coordinate subchannels are optimized and the latter N · (1 − C) coordinate subchannels are degraded. The optimized subchannels are used to transmit information bits and the degraded ones are used to transmit frozen bits. Hence, about N · C of the N transmitted bits carry information, and the code rate asymptotically achieves the channel capacity C.
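The polarization effect described above can be observed numerically. The sketch below uses a binary erasure channel (BEC), for which the capacities of the coordinate subchannels follow an exact recursion (a subchannel of capacity I splits into one of capacity I^2 and one of capacity 2I − I^2). This choice is an assumption made to keep the example short: the channels considered in this paper are BSCs, whose subchannels require approximate methods such as Tal's construction 47. The function names are hypothetical.

```python
def bec_subchannel_capacities(erasure_prob, levels):
    """Capacities of the 2**levels coordinate subchannels of a polarized BEC."""
    z = [erasure_prob]                      # erasure probabilities at the current level
    for _ in range(levels):
        # Each subchannel splits into a worse one (2e - e^2) and a better one (e^2).
        z = [v for e in z for v in (2 * e - e * e, e * e)]
    return [1.0 - e for e in z]

def polarized_fractions(erasure_prob, levels, delta=0.01):
    """Fractions of subchannels with capacity above 1 - delta and below delta."""
    caps = bec_subchannel_capacities(erasure_prob, levels)
    good = sum(c > 1 - delta for c in caps) / len(caps)
    bad = sum(c < delta for c in caps) / len(caps)
    return good, bad

# The mean capacity stays equal to C = 1 - erasure_prob at every level,
# while more and more subchannels move close to capacity 1 or 0.
for levels in (4, 8, 12, 16):
    caps = bec_subchannel_capacities(0.5, levels)
    good, bad = polarized_fractions(0.5, levels)
    print(levels, round(sum(caps) / len(caps), 6), round(good, 3), round(bad, 3))
```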
Denote the original N i.i.d. channels by W. As shown in Fig. 2, through channel combining in a recursive way, we get the combined channel W_N of all N i.i.d. channels. Then, through channel splitting, we obtain N coordinate subchannels W_N^(i) 46. The superscript (i) denotes the ith subchannel. In the rest of this paper, 1 ≤ i ≤ N. Under a finite code length N, we need to evaluate the channel quality of each coordinate subchannel. According to the channel quality, we rank all coordinate subchannels in descending order. Then, the first K of them are chosen to transmit information bits according to the concrete error correction requirement. In this way, the construction of polar codes is fulfilled. It is worth noting that the choice of K affects the reliability and the code rate of the code structure we design: if K is too high, the decoding reliability will be unacceptable; if it is too low, the capacity-achieving property of polar codes cannot be fully exploited and hence the code rate will be unsatisfactory. K can be determined by setting a target frame error rate (TFER), a predefined value below which Alice and Bob try to keep the practical frame error rate of their communication through error correction; this is what our algorithm uses in "Simulation results". At present, there are several ways to realize the construction of polar codes 46-49. In this paper, we adopt Tal's method 47 to construct polar codes, in which the probability of error P_e(W

Polar-code-based efficient post-processing algorithm
Error correction and privacy amplification are two crucial steps in QKD post-processing. The goal of error correction is to eliminate the difference between Alice's sifted key KA_sifted and Bob's sifted key KB_sifted through information exchange between Alice and Bob, so that they can obtain the information which is equal to the capacity C(W) of the main channel. The goal of privacy amplification is to compress the exchanged information between Alice and Bob to remove the information Eve can obtain, which is equal to the capacity C(W*) of the wiretap channel.
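The two capacities targeted by error correction and privacy amplification depend only on the QBER p through the binary entropy function. A minimal Python sketch, assuming the BSC model and the relation k_th = 1 − 2h_2(p) from "Basic theory" (the function names are hypothetical), shows how the secrecy capacity shrinks as p approaches the upper end of the range [0, 0.11]:

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def qkd_capacities(p):
    """Return (C(W), C(W*), C_sec) for QBER p under the BSC assumption."""
    c_main = 1.0 - h2(p)        # information shared by Alice and Bob
    c_wiretap = h2(p)           # worst-case information available to Eve
    return c_main, c_wiretap, c_main - c_wiretap

for p in (0.01, 0.03, 0.05, 0.08, 0.11, 0.12):
    c_main, c_wiretap, c_sec = qkd_capacities(p)
    print(f"p={p:.2f}  C(W)={c_main:.4f}  C(W*)={c_wiretap:.4f}  C_sec={c_sec:.4f}")
```

The secrecy capacity is still slightly positive at p = 0.11 and becomes negative at p = 0.12, in line with the stated range.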
Aiming at the functions of these two crucial steps, we propose an efficient post-processing algorithm which fulfills error correction and privacy amplification at the same time. This algorithm is called the polar-code-based efficient post-processing (PCEP) algorithm. Denote the TFER by FER_target and the target privacy amplification index (TPAI) by PAI_target. The TPAI is a predefined value below which Alice and Bob try to keep the practical privacy amplification index. The privacy amplification index is the leaked information rate, which is equal to the amount of information leaked in a single code block divided by the code block length. The concrete steps of PCEP are as follows.

Steps of PCEP algorithm. Step 1: Parameter estimation
Alice and Bob compare the bases they used in the qubit transmission phase and obtain their sifted keys KA_sifted and KB_sifted. Then, as in other common post-processing algorithms, they choose some bits from their sifted keys to estimate the bit error rate p_m in the main channel. (To distinguish the indexes of the main channel and the wiretap channel, we write an "m" in the subscript for quantities that belong to the main channel and a "w" for quantities that belong to the wiretap channel.) If p_m exceeds the security threshold, they abort this key distribution; otherwise they proceed to the next step.
Step 2: Polarization of the main channel
Alice and Bob polarize the main channel W by Arikan's method 46 and obtain N coordinate subchannels W_N^(i). [...] in ascending order, and choose the first K_m coordinate subchannels which satisfy Eq. (11) to compose the optimized channel set G_N(W, FER_target). The rest of the coordinate subchannels compose the degraded channel set B_N(W, FER_target).
That is to say, Alice and Bob divide all coordinate subchannels in the main channel into two sets, G_N(W, FER_target) and B_N(W, FER_target), given by Eqs. (12) and (13). From Eqs. (12) and (13), we can see that G_N and B_N are functions of W and FER_target. This is why we write G_N as G_N(W, FER_target) and B_N as B_N(W, FER_target). For convenience, G_N and B_N will be used in the rest of this paper.
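The selection of the optimized set for the main channel can be sketched as follows. The sketch assumes that each coordinate subchannel comes with an estimate of its error probability and that Eq. (11) bounds the sum of these estimates over the chosen subchannels by FER_target; the numeric values, the 0-based indices and the function name are hypothetical, and the estimates would in practice come from a construction method such as Tal's 47.

```python
def select_optimized_set(error_estimates, fer_target):
    """Split subchannel indices into (G_N, B_N) under a target frame error rate.

    Subchannels are sorted by error estimate in ascending order and added
    one by one while the accumulated sum stays within fer_target.
    """
    order = sorted(range(len(error_estimates)), key=lambda i: error_estimates[i])
    chosen, total = [], 0.0
    for i in order:
        if total + error_estimates[i] > fer_target:
            break                       # adding this subchannel would exceed the target
        total += error_estimates[i]
        chosen.append(i)
    good = set(chosen)
    bad = set(range(len(error_estimates))) - good
    return good, bad

# Hypothetical error estimates for N = 8 subchannels.
estimates = [0.4, 1e-4, 0.02, 1e-6, 0.3, 0.05, 1e-3, 0.45]
G_N, B_N = select_optimized_set(estimates, fer_target=0.1)
print(sorted(G_N), sorted(B_N))   # [1, 2, 3, 5, 6] [0, 4, 7]
```

A larger FER_target admits more subchannels into G_N and therefore raises the code rate at the cost of reliability, which is the trade-off governing the choice of K.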
Step 5: Polarization of the wiretap channel
Alice and Bob polarize the wiretap channel W* by Arikan's method 46 and obtain N coordinate subchannels W*_N^(i).
Step 6: Channel quality evaluation in the wiretap channel
Alice and Bob calculate the bit error rate p_w of the wiretap channel according to I_AE = 1 − h_2(p_w) = h_2(p_m), as mentioned in "Basic theory". Then they take p_w as the channel quality index of the wiretap channel, according to which they adopt Tal's polar codes construction algorithm 47 [...] The channel capacity C_w(W*_N^(i)) is used to evaluate the channel quality of each coordinate subchannel: the higher the capacity, the better the subchannel.
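The relation 1 − h_2(p_w) = h_2(p_m) has no closed-form solution for p_w, but since h_2 is strictly increasing on [0, 0.5] it can be solved by bisection. The following Python sketch is one way to do so; the function names and the tolerance are assumptions for this example.

```python
import math

def h2(p):
    """Binary entropy function in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def wiretap_error_rate(p_m, tol=1e-12):
    """Solve 1 - h2(p_w) = h2(p_m) for p_w in [0, 0.5] by bisection."""
    target = 1.0 - h2(p_m)          # required value of h2(p_w)
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h2(mid) < target:
            lo = mid                # p_w lies above mid
        else:
            hi = mid                # p_w lies at or below mid
    return (lo + hi) / 2

for p_m in (0.01, 0.03, 0.05, 0.08, 0.11):
    print(f"p_m={p_m:.2f}  p_w={wiretap_error_rate(p_m):.4f}")
```

Within the admissible range the resulting p_w is larger than p_m, so the equivalent wiretap BSC is noisier than the main channel, and the two error rates nearly coincide as p_m approaches 0.11.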
Step 7: Optimized coordinate subchannels selection in the wiretap channel
Alice and Bob sort all coordinate subchannels W*_N^(i) in ascending order of channel capacity and choose the first K_w ones which satisfy Eq. (15) to compose the degraded channel set B*_N(W*, PAI_target) with respect to Eve. The rest of the coordinate subchannels compose the optimized channel set G*_N(W*, PAI_target) with respect to Eve.
That is to say, Alice and Bob divide all coordinate subchannels in the wiretap channel into two sets, B*_N(W*, PAI_target) and G*_N(W*, PAI_target), given by Eqs. (16) and (17). From Eqs. (16) and (17), we can see that B*_N and G*_N are functions of W* and PAI_target. This is why we write G*_N as G*_N(W*, PAI_target) and B*_N as B*_N(W*, PAI_target). For convenience, G*_N and B*_N will be used in the rest of this paper.
Step 8: Determination of code structure
After the above steps, Alice and Bob obtain four sets of coordinate subchannels. The first set G_N contains the subchannels optimized to Bob, the second set B_N contains those degraded to Bob, the third set G*_N contains those optimized to Eve, and the last set B*_N contains those degraded to Eve. As shown in Fig. 3, the subchannels which belong to B_N must belong to B*_N, and the ones which belong to G*_N must belong to G_N. This is because the wiretap channel is degenerative with respect to the main channel. Therefore, those subchannels which are degraded to Bob must be degraded to Eve, and those which are optimized to Eve must be optimized to Bob. Hence, G_N and B*_N intersect. Based on the above analysis of the four sets G_N, B_N, G*_N, and B*_N, Alice and Bob can redivide all subchannels into three disjoint sets as follows:
$$A = G_N \cap B^*_N, \qquad R = G_N \cap G^*_N = G^*_N, \qquad B = B_N \cap B^*_N = B_N.$$
Alice and Bob choose the subchannels in A to transmit the information bits (in this situation, the bits of the secret key), the subchannels in R to transmit random bits, and the subchannels in B to transmit frozen bits. With this redivision, the code structure is determined. Notice that all the code construction work, including steps 4 to 8, can actually be done by Alice alone. Once Alice finishes this work, she transmits the code structure to Bob. Hence, Fig. 4 does not show Bob taking part in the code construction work.
$$\sum_{i} UP_{e,m}(W_N^{(i)}) \le FER_{target} \qquad (11)$$
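The redivision into A, R and B can be sketched with plain set operations. The text states that the subchannels in A are optimized to Bob but degraded to Eve, that R carries random bits and that B carries frozen bits; the set expressions used below (A = G_N ∩ B*_N, R = G*_N, B = B_N) are a reading of that description under the inclusion G*_N ⊆ G_N, and the function name and index values are hypothetical.

```python
def redivide_subchannels(good_bob, good_eve, n):
    """Partition indices 0..n-1 into (A, R, B) from Bob's and Eve's optimized sets."""
    all_idx = set(range(n))
    if not good_eve <= good_bob:
        # Expected to hold when the wiretap channel is degenerative w.r.t. the main channel.
        raise ValueError("subchannels optimized to Eve must be optimized to Bob")
    bad_bob = all_idx - good_bob
    bad_eve = all_idx - good_eve
    info_set = good_bob & bad_eve      # A: reliable for Bob, useless for Eve -> key bits
    random_set = good_eve              # R: readable by Eve -> random bits
    frozen_set = bad_bob               # B: unreliable even for Bob -> frozen bits
    return info_set, random_set, frozen_set

# Hypothetical sets for N = 8.
A, R, B = redivide_subchannels(good_bob={1, 2, 3, 5, 6}, good_eve={3, 6}, n=8)
print(sorted(A), sorted(R), sorted(B))   # [1, 2, 5] [3, 6] [0, 4, 7]
```

The size of A divided by N is the practical code rate of the resulting structure, so tightening either FER_target or PAI_target shrinks A.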
Step 9: Code transmission
Alice randomly generates the bits which belong to R, sets all bits which belong to B to zero, and puts KA_sifted into the bits which belong to A. Then she concatenates them according to the order of the corresponding coordinate subchannels to form the original code. After encoding the original code with the systematic polar coding algorithm 50, Alice gets the code CW_enc. As shown in Fig. 4, CW_enc is composed of CW_enc^chk1, CW_enc^final (under systematic polar coding, CW_enc^final = KA_sifted) and CW_enc^chk2, which are the systematic polar encoding results of the bits belonging to R, A, and B, respectively. Alice sends only the check bits CW_enc^chk1 and CW_enc^chk2 to Bob through the classical public channel.
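The assembly of the original code in this step can be illustrated with a simplified sketch. It fills the positions in A with key bits, the positions in R with random bits and the positions in B with zeros, and then applies the plain (non-systematic) polar transform x = u F^{⊗n} over GF(2) without bit reversal. This is a deliberate simplification: the algorithm in this paper uses systematic polar coding 50, which the sketch does not implement, so it does not produce the check bits CW_enc^chk1 and CW_enc^chk2. All names and index sets are hypothetical.

```python
import numpy as np

def build_original_code(key_bits, info_set, random_set, n, rng):
    """Place key bits on A, random bits on R and zeros on the remaining (frozen) positions."""
    if len(key_bits) != len(info_set):
        raise ValueError("need exactly one key bit per position in A")
    u = np.zeros(n, dtype=np.uint8)                       # frozen positions stay zero
    u[sorted(info_set)] = np.asarray(key_bits, dtype=np.uint8)
    u[sorted(random_set)] = rng.integers(0, 2, size=len(random_set), dtype=np.uint8)
    return u

def polar_transform(u):
    """Compute u * F^{(x)n} over GF(2), with F = [[1, 0], [1, 1]], using butterflies."""
    x = np.array(u, dtype=np.uint8)                       # work on a copy
    n = x.size
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            # The upper half of each block absorbs the lower half (XOR).
            x[start:start + step] ^= x[start + step:start + 2 * step]
        step *= 2
    return x

rng = np.random.default_rng(0)
u = build_original_code([1, 0, 1], info_set={1, 2, 5}, random_set={3, 6}, n=8, rng=rng)
x = polar_transform(u)
print(u, x)
```

Because F^{⊗n} is its own inverse over GF(2), applying `polar_transform` twice returns the input, which gives a quick sanity check of the butterfly structure.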
Step 10: Error correction
Bob puts his sifted key KB_sifted into the bits which belong to A, puts CW_enc^chk1 into the bits which belong to R, and puts CW_enc^chk2 into the bits which belong to B. Then he decodes this bit string to get CW_enc. Because the coordinate subchannels in set A are optimized to Bob but degraded to Eve, the code structure determined in step 8 is optimized to Bob but degraded to Eve. Hence, with the growth of the code length N, the decoding error rate of Bob tends to 0 while the decoding error rate of Eve tends to 0.5 (namely, the information in the wiretap channel has been compressed to zero). That is to say, lim_{n→∞} Pr(CW

Simulation results
To demonstrate the feasibility of the PCEP algorithm, we conduct a series of simulation experiments on code rate, reliability and security. Note that the range of p_m is limited to [0, 0.11] because, as mentioned in "Basic theory", only in this range is W* degenerative with respect to W. In all simulation experiments, we set FER_target to 0.1 and PAI_target to 10^−7.
Code rate. As shown in Fig. 5, we calculate the code rate under different code lengths N. It is observed that as the QBER p_m of the main channel increases, the code rate tends to zero. Moreover, except for a single point (N = 2^20, p_m = 0.03), under the same QBER p_m, the longer the code length is, the higher the code rate is. This is in accord with the asymptotic property of polar codes. Figure 6 shows the ratio of the practical code rate to the theoretical secure code rate. It can be observed that as the QBER p_m increases, the ratio decreases to zero. The theoretical secure code rate can be regarded as a measure of the error correcting capability of polar codes, while the practical code rate can be regarded as a measure of the specific requirement for error correcting capability in a certain setting. Therefore, the ratio measures the extent to which the requirement can be met: the lower the ratio is, the more fully the requirement is met, and hence the better the error correcting performance is. Consequently, the lower the ratio is, the higher the decoding reliability should be, which is consistent with the simulation result in "Reliability".
Security: the decoding FER and BER of Eve. According to Eq. (5), the security of the PCEP algorithm can be measured by the decoding FER and BER of Eve, which are shown in Figs. 7 and 8. It can be observed that when the QBER p_m is small, the decoding FER and BER of Eve satisfy the security condition Eq. (5) well (FER = 1, BER ∼ 0.5), while there is a threshold of QBER beyond which the decoding FER and BER of Eve dramatically decrease to zero. Moreover, the longer the code length, the higher the threshold, which is consistent with the asymptotic property of polar codes.
Figure 6. The ratio of the practical code rate and the theoretical secure code rate.
Reliability. According to Eq. (1), the reliability of the PCEP algorithm can be measured by the decoding FER and BER of Bob, which are shown in Figs. 9 and 10. It is observed that the practical decoding FER and BER are satisfactory under all code lengths shown in Figs. 9 and 10. Besides, as shown in Fig. 9, the maximum FER in the simulation is around 10^−4, when N = 2^10 and p = 0.01. Since the TFER is set to 0.1, this target is well achieved. Moreover, under different code lengths, the decoding FER and BER of Bob decrease to zero rapidly as the QBER p_m increases. The reason for this counterintuitive phenomenon has been explained in the last paragraph of "Code rate".

Conclusion
In this paper, an efficient QKD post-processing algorithm based on polar codes, PCEP, is proposed. In the PCEP algorithm, by analyzing the channel capacities of the main channel and the wiretap channel under Wyner's wiretap channel model, we design a codeword structure of polar codes so that error correction and privacy amplification can be completed synchronously in a single encoding and decoding process. That is to say, the PCEP algorithm combines these two post-processing steps into one step. In this way, the PCEP algorithm can reduce the complexity and lower the post-processing delay of QKD systems. This provides a new way to develop high-speed QKD systems. To clarify the reliability and security of the PCEP algorithm, the reliability and security conditions have been deduced from the perspective of information theory. Simulation results show that the PCEP algorithm satisfies the reliable and secure communication conditions well.